Partition sampling: an active learning selection strategy for large database annotation
Abstract
Annotating a video database requires intensive, time-consuming, and error-prone human effort. However, it is a mandatory task for analyzing multimedia content efficiently. We propose a new selection strategy for active learning methods that minimizes the human effort needed to label a large database of video sequences. Formally, active learning is a process in which new unlabeled samples are iteratively selected, presented to users for annotation, and added to the training set. The major problem is then to find the selection function that reaches high classification accuracy most quickly. We show that existing active learning approaches based on selective sampling do not maintain their performance when the number of samples selected per iteration increases. The presented selection strategy attempts to solve this problem. In practice, selecting many samples per iteration offers several advantages when dealing with a large amount of data, among them the possibility of sharing the annotation effort among several users. Finally, we attempt to tackle the more realistic and challenging task of multiple-label annotation, which would reduce the human labeling effort to a greater extent.
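The iterative process described in the abstract (select unlabeled samples, have users annotate them, add them to the training set) can be sketched as a generic pool-based loop. This is a minimal illustration under stated assumptions, not the partition sampling strategy of the paper: the nearest-centroid classifier, the margin-based selection rule, the synthetic 2-D data, and all names (`fit`, `margin`, `active_learning`, `oracle`) are hypothetical stand-ins chosen for brevity.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def fit(labeled):
    """Toy nearest-centroid model (an assumption): one centroid per label."""
    by_class = {}
    for x, y in labeled:
        by_class.setdefault(y, []).append(x)
    return {
        y: (sum(p[0] for p in xs) / len(xs), sum(p[1] for p in xs) / len(xs))
        for y, xs in by_class.items()
    }

def margin(model, x):
    """Gap between the two nearest centroids; small means 'uncertain'.
    Requires at least two classes in the model."""
    d = sorted(sq_dist(c, x) for c in model.values())
    return d[1] - d[0]

def active_learning(pool, oracle, seed_set, batch_size, iterations):
    """Generic loop: select a batch, ask the oracle (the human annotator),
    add the answers to the training set, retrain."""
    labeled = list(seed_set)
    seeded = {x for x, _ in seed_set}
    unlabeled = [x for x in pool if x not in seeded]
    for _ in range(iterations):
        if not unlabeled:
            break
        model = fit(labeled)
        # Selection function: here, the batch_size most uncertain samples.
        unlabeled.sort(key=lambda x: margin(model, x))
        batch, unlabeled = unlabeled[:batch_size], unlabeled[batch_size:]
        labeled += [(x, oracle(x)) for x in batch]
    return fit(labeled), labeled

# Synthetic pool: two Gaussian blobs; `truth` plays the role of the annotator.
rng = random.Random(0)
truth = {}
for label, (cx, cy) in enumerate([(0.0, 0.0), (5.0, 5.0)]):
    for _ in range(100):
        truth[(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))] = label
pool = list(truth)

# One seed sample per class so that the margin is defined from the start.
seed_set = [(pool[0], truth[pool[0]]), (pool[-1], truth[pool[-1]])]
model, labeled = active_learning(pool, truth.get, seed_set,
                                 batch_size=10, iterations=5)
```

The `batch_size` parameter corresponds to the number of samples selected per iteration. A plain top-k uncertainty rule like the one above tends to pick near-duplicate samples as the batch grows, which is the kind of degradation the abstract attributes to selective sampling. With several annotators, a batch could be divided among them, for example with `batch[i::n_users]` for user `i`.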
Similar resources
Partition Sampling for Active Video Database Annotation
Annotating a video-database requires an intensive human effort that is time consuming and error prone. However this task is mandatory to bridge the gap between low-level video features and the semantic content. We propose a partition sampling active learning method to minimize human effort in labeling. Formally, active learning is a process where new unlabeled samples are iteratively selected a...
Active learning for sense annotation
This article describes a real (nonsynthetic) active-learning experiment to obtain supersense annotations for Danish. We compare two instance selection strategies, namely lowest-prediction confidence (MAX), and sampling from the confidence distribution (SAMPLE). We evaluate their performance during the annotation process, across domains for the final resulting system, as well as against in-domai...
Active Learning for Coreference Resolution
We present an active learning method for coreference resolution that is novel in three respects. (i) It uses bootstrapped neighborhood pooling, which ensures a class-balanced pool even though gold labels are not available. (ii) It employs neighborhood selection, a selection strategy that ensures coverage of both positive and negative links for selected markables. (iii) It is based on a query-by...
Bidding Strategy on Demand Side Using Eligibility Traces Algorithm
Restructuring in the power industry is followed by splitting different parts and creating a competition between purchasing and selling sections. As a consequence, through an active participation in the energy market, the service provider companies and large consumers create a context for overcoming the problems resulted from lack of demand side participation in the market. The most prominent ch...
A Comparison of Models for Cost-Sensitive Active Learning
Active Learning (AL) is a selective sampling strategy which has been shown to be particularly cost-efficient by drastically reducing the amount of training data to be manually annotated. For the annotation of natural language data, cost efficiency is usually measured in terms of the number of tokens to be considered. This measure, assuming uniform costs for all tokens involved, is, from a lingu...